227 research outputs found
Phase Transitions in the Pooled Data Problem
In this paper, we study the pooled data problem of identifying the labels
associated with a large collection of items, based on a sequence of pooled
tests revealing the counts of each label within the pool. In the noiseless
setting, we identify an exact asymptotic threshold on the required number of
tests with optimal decoding, and prove a phase transition between complete
success and complete failure. In addition, we present a novel noisy variation
of the problem, and provide an information-theoretic framework for
characterizing the required number of tests for general random noise models.
Our results reveal that noise can make the problem considerably more difficult,
with strict increases in the scaling laws even at low noise levels. Finally, we
demonstrate similar behavior in an approximate recovery setting, where a given
number of errors is allowed in the decoded labels.
Comment: Accepted to NIPS 201
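As a rough illustration of the noiseless observation model described in this abstract, the sketch below generates pooled tests that reveal per-label counts and then lists every labeling consistent with them by brute force. The names (`pooled_counts`, `consistent_labelings`), the half-probability random pools, and the tiny problem size are assumptions made for this example, not details taken from the paper. Exhaustive search is feasible only for very small instances.

```python
import itertools
import random


def pooled_counts(labels, pool, num_labels):
    """Return how many items in `pool` carry each label (one noiseless test)."""
    counts = [0] * num_labels
    for item in pool:
        counts[labels[item]] += 1
    return tuple(counts)


def consistent_labelings(n, num_labels, pools, observations):
    """Brute-force search: every labeling that reproduces all observed counts."""
    return [
        cand
        for cand in itertools.product(range(num_labels), repeat=n)
        if all(
            pooled_counts(cand, pool, num_labels) == obs
            for pool, obs in zip(pools, observations)
        )
    ]


# Hypothetical toy instance: 6 items, 2 labels, 8 random pools.
rng = random.Random(0)
n, num_labels, num_tests = 6, 2, 8
true_labels = tuple(rng.randrange(num_labels) for _ in range(n))
# Assumption: each item joins each pool independently with probability 1/2.
pools = [[i for i in range(n) if rng.random() < 0.5] for _ in range(num_tests)]
observations = [pooled_counts(true_labels, pool, num_labels) for pool in pools]

candidates = consistent_labelings(n, num_labels, pools, observations)
print("true labeling:", true_labels)
print("labelings consistent with all tests:", len(candidates))
```

The true labeling is always among the candidates. Recovery succeeds in this toy sense when it is the only one, which tends to require more tests as the number of items grows.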
Limits on Support Recovery with Probabilistic Models: An Information-Theoretic Framework
The support recovery problem consists of determining a sparse subset of a set
of variables that is relevant in generating a set of observations, and arises
in a diverse range of settings such as compressive sensing, subset
selection in regression, and group testing. In this paper, we take a unified
approach to support recovery problems, considering general probabilistic models
relating a sparse data vector to an observation vector. We study the
information-theoretic limits of both exact and partial support recovery, taking
a novel approach motivated by thresholding techniques in channel coding. We
provide general achievability and converse bounds characterizing the trade-off
between the error probability and number of measurements, and we specialize
these to the linear, 1-bit, and group testing models. In several cases, our
bounds not only provide matching scaling laws in the necessary and sufficient
number of measurements, but also sharp thresholds with matching constant
factors. Our approach has several advantages over previous approaches: For the
achievability part, we obtain sharp thresholds under broader scalings of the
sparsity level and other parameters (e.g., signal-to-noise ratio) compared to
several previous works, and for the converse part, we not only provide
conditions under which the error probability fails to vanish, but also
conditions under which it tends to one.
Comment: Accepted to IEEE Transactions on Information Theory; presented in part at ISIT 2015 and SODA 201
Noisy Non-Adaptive Group Testing: A (Near-)Definite Defectives Approach
The group testing problem consists of determining a small set of defective
items from a larger set of items based on a number of possibly-noisy tests, and
is relevant in applications such as medical testing, communication protocols,
pattern matching, and many more. We study the noisy version of the problem,
where the output of each standard noiseless group test is subject to
independent noise, corresponding to passing the noiseless result through a
binary channel. We introduce a class of algorithms that we refer to as
Near-Definite Defectives (NDD), and study bounds on the required number of
tests for vanishing error probability under Bernoulli random test designs. In
addition, we study algorithm-independent converse results, giving lower bounds
on the required number of tests under Bernoulli test designs. Under reverse
Z-channel noise, the achievable rates and converse results match in a broad
range of sparsity regimes, and under Z-channel noise, the two match in a
narrower range of dense/low-noise regimes. We observe that although these two
channels have the same Shannon capacity when viewed as a communication channel,
they can behave quite differently when it comes to group testing. Finally, we
extend our analysis of these noise models to the symmetric noise model, and
show improvements over the best known existing bounds in broad scaling regimes.
Comment: Submitted to IEEE Transactions on Information Theor
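The remark that the two one-sided noise channels share the same Shannon capacity can be checked numerically. The sketch below is a minimal illustration rather than anything from the paper: it maximizes the mutual information of a binary-input channel over a grid of input distributions. The orientation of each channel (which symbol is flipped), the noise level `rho`, and the function names are assumptions chosen for the example.

```python
import math


def mutual_information(q, channel):
    """I(X;Y) in bits, with P(X=1) = q and channel[x][y] = P(Y=y | X=x)."""
    px = [1.0 - q, q]
    py = [sum(px[x] * channel[x][y] for x in range(2)) for y in range(2)]
    total = 0.0
    for x in range(2):
        for y in range(2):
            joint = px[x] * channel[x][y]
            if joint > 0.0:
                total += joint * math.log2(channel[x][y] / py[y])
    return total


def capacity(channel, grid=20001):
    """Approximate capacity by a grid search over the input distribution."""
    return max(mutual_information(i / (grid - 1), channel) for i in range(grid))


rho = 0.1  # hypothetical noise level
# Assumed orientation: the Z-channel flips 1 -> 0 with probability rho,
# the reverse Z-channel flips 0 -> 1 with probability rho.
z_channel = [[1.0, 0.0], [rho, 1.0 - rho]]
reverse_z_channel = [[1.0 - rho, rho], [0.0, 1.0]]

print("Z-channel capacity (bits):        ", capacity(z_channel))
print("reverse Z-channel capacity (bits):", capacity(reverse_z_channel))
```

The two values coincide because one channel is the other with the input and output symbols relabeled. The abstract's point is that this symmetry does not carry over to group testing, where the roles of positive and negative outcomes differ.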
Near-Optimal Noisy Group Testing via Separate Decoding of Items
The group testing problem consists of determining a small set of defective
items from a larger set of items based on a number of tests, and is relevant in
applications such as medical testing, communication protocols, pattern
matching, and more. In this paper, we revisit an efficient algorithm for noisy
group testing in which each item is decoded separately (Malyutov and Mateev,
1980), and develop novel performance guarantees via an information-theoretic
framework for general noise models. For the special cases of no noise and
symmetric noise, we find that the asymptotic number of tests required for
vanishing error probability is within a factor log 2 ≈ 0.7 of the
information-theoretic optimum at low sparsity levels, and that with a small
fraction of allowed incorrectly decoded items, this guarantee extends to all
sublinear sparsity levels. In addition, we provide a converse bound showing
that if one tries to move slightly beyond our low-sparsity achievability
threshold using separate decoding of items and i.i.d. randomized testing, the
average number of items decoded incorrectly approaches that of a trivial
decoder.
Comment: Submitted to IEEE Journal of Selected Topics in Signal Processin
Second-Order Asymptotics for the Discrete Memoryless MAC with Degraded Message Sets
This paper studies the second-order asymptotics of the discrete memoryless
multiple-access channel with degraded message sets. For a fixed average error
probability ε ∈ (0,1) and an arbitrary point on the boundary of the
capacity region, we characterize the speed of convergence of rate pairs that
converge to that point for codes that have asymptotic error probability no
larger than ε, thus complementing an analogous result given previously
for the Gaussian setting.
Comment: 5 Pages, 1 Figure. Follow-up paper of http://arxiv.org/abs/1310.1197. Accepted to ISIT 201